Predicting Indexer Performance in a Distributed Digital Library
Abstract
Resource discovery in a distributed digital library poses many challenges, one of which is how to choose, given a query and a set of search engines, which engines should receive the query. This paper focuses on search engine performance as a selection criterion and defines two measures of that performance: availability (will the search engine respond within a time limit?) and response time (how quickly will the search engine respond, given that it responds at all?). We predicted both of these performance characteristics with a variety of algorithms, all of which required little computation time and combined past performance data for each search engine into a succinct record. We used operational data from the NCSTRL distributed digital library to make and evaluate predictions. We found that simple prediction methods performed as well as more complex methods and that prediction accuracy was closely related to data consistency.
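The abstract does not name the prediction algorithms, so the following is only a minimal sketch of one simple method that fits its description: an exponentially weighted moving average that folds each new observation into a small per-engine record. The class name `EngineRecord`, the smoothing weight `alpha`, and the helper `rank_engines` are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def _smooth(old: Optional[float], new: float, alpha: float) -> float:
    """Exponentially weighted moving average; the first sample seeds the value."""
    return new if old is None else alpha * new + (1.0 - alpha) * old


@dataclass
class EngineRecord:
    """Succinct summary of one engine's past performance (hypothetical layout)."""
    alpha: float = 0.3                     # assumed smoothing weight
    availability: Optional[float] = None   # predicted chance of answering in time
    response_time: Optional[float] = None  # predicted seconds, given an answer

    def update(self, elapsed: Optional[float], time_limit: float) -> None:
        """Fold in one observation; elapsed=None means no response at all."""
        responded = elapsed is not None and elapsed <= time_limit
        self.availability = _smooth(
            self.availability, 1.0 if responded else 0.0, self.alpha
        )
        # Response time is conditional on a response, so failures are skipped.
        if responded:
            self.response_time = _smooth(self.response_time, elapsed, self.alpha)


def rank_engines(records: Dict[str, EngineRecord]) -> List[str]:
    """Order engines by predicted availability, then by predicted response time."""
    def key(name: str):
        rec = records[name]
        avail = rec.availability if rec.availability is not None else 0.0
        rt = rec.response_time if rec.response_time is not None else float("inf")
        return (-avail, rt)
    return sorted(records, key=key)
```

Each update costs constant time and the record holds only two numbers per engine, which matches the "little computation time" and "succinct record" constraints stated above. Whether this particular smoothing scheme was among the methods evaluated is not stated in the abstract.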
Similar resources
A Characterization Study of NCSTRL Distributed Searching
NCSTRL, the Networked Computer Science Technical Reference Library, is a federated digital library based on the Dienst architecture. One aspect of this architecture is distributed searching, with digital library queries being dispatched from query routers to globally distributed indexers that process them and return results. We studied user data for a two-month period at five query routers in o...
Virtual Digital Library
DIGMAP is a digital library specialized in searching and browsing services for old maps and related resources. The service reuses metadata from national libraries and other relevant third-party metadata sources, providing added-value services by aggregating all the data in comprehensive collections, browsing indexes and search functions. The services are based on a set of specialized tools, com...
Development of Quality Performance of National Digital Library with Kano's Model Approach
Background and Aim: The purpose of this study is to determine the quality requirements of the National Digital Library based on the Kano model and to categorize users' needs into three groups: basic, functional, and motivational. Methods: This survey was conducted with a qualitative approach. The requirements of the digital library were extracted using two standards: "Digiqual manual" and the "D...
Semi automatic indexing of PostScript files using Medical Text Indexer in medical education.
At Albert Einstein College of Medicine a large part of online lecture materials contain PostScript files. As the collection grows it becomes essential to create a digital library to have easy access to relevant sections of the lecture material that is full-text indexed; to create this index it is necessary to extract all the text from the document files that constitute the originals of the lect...
Toward conceptual indexing using automatic assignment of descriptors
Indexing techniques have reached a mature state. Digital libraries and other digital collections make intensive use of these algorithms to store and retrieve documents. On the other hand, we have browsing techniques, which let the user gather information. Current approaches are not yet advanced enough to satisfy the user. At CERN we are working on an indexer based on th...